Probabilistic Models for Disambiguation of an HPSG-Based Chart Generator

Authors

  • Hiroko Nakanishi
  • Yusuke Miyao
  • Jun'ichi Tsujii
Abstract

We describe probabilistic models for a chart generator based on HPSG. Within the research field of parsing with lexicalized grammars such as HPSG, recent developments have achieved efficient estimation of probabilistic models and high-speed parsing guided by probabilistic models. The focus of this paper is to show that two essential techniques – model estimation on packed parse forests and beam search during parsing – are successfully exported to the task of natural language generation. Additionally, we report empirical evaluation of the performance of several disambiguation models and how the performance changes according to the feature set used in the models and the size of training data.
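As a rough illustration of the beam search idea mentioned in the abstract, the sketch below prunes the candidate edges of a single chart cell using an unnormalized log-linear score. It is not the authors' implementation: the feature names, weights, and the `beam_prune` function with its `beam_size` and `beam_width` parameters are all hypothetical, chosen only to show the general technique.

```python
from typing import Dict, List, Tuple

# Hypothetical feature weights for a log-linear model (illustrative values only).
WEIGHTS: Dict[str, float] = {
    "rule:head-comp": 0.8,
    "rule:head-mod": 0.3,
    "lex:give": 0.5,
}

# An edge is a (name, active feature list) pair; both are assumptions of this sketch.
Edge = Tuple[str, List[str]]


def score(features: List[str]) -> float:
    """Unnormalized log-linear score: the sum of the weights of active features."""
    return sum(WEIGHTS.get(f, 0.0) for f in features)


def beam_prune(edges: List[Edge], beam_size: int, beam_width: float) -> List[Tuple[str, float]]:
    """Keep at most `beam_size` edges whose score lies within `beam_width` of the best."""
    if not edges:
        return []
    # Sort candidate edges from highest to lowest score.
    scored = sorted(((score(feats), name) for name, feats in edges), reverse=True)
    best = scored[0][0]
    return [(name, s) for s, name in scored[:beam_size] if best - s <= beam_width]


cell = [
    ("a", ["rule:head-comp", "lex:give"]),
    ("b", ["rule:head-mod"]),
    ("c", ["unknown-feature"]),
    ("d", ["rule:head-comp"]),
]
kept = beam_prune(cell, beam_size=3, beam_width=0.6)
```

Pruning each cell this way trades exhaustive search for speed: low-scoring edges are discarded before they can combine into larger edges, so the best realization may occasionally be lost if the beam is too narrow.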

Similar resources

Robust PCFG-Based Generation Using Automatically Acquired LFG Approximations

Wide-coverage grammars automatically extracted from treebanks are a cornerstone technology in state-of-the-art probabilistic parsing. They achieve robustness and coverage at a fraction of the development cost of hand-crafted grammars. It is surprising to note that to date, such grammars do not usually figure in the complementary operation to parsing – natural language surface realisation. Banga...

Probabilistic Disambiguation Models for Wide-Coverage HPSG Parsing

This paper reports the development of log-linear models for disambiguation in wide-coverage HPSG parsing. The estimation of log-linear models requires high computational cost, especially with wide-coverage grammars. Using techniques to reduce the estimation cost, we trained the models using 20 sections of the Penn Treebank. A series of experiments empirically evaluated the estimation techniques, ...

Adapting a Probabilistic Disambiguation Model of an HPSG Parser to a New Domain

This paper describes a method of adapting a domain-independent HPSG parser to a biomedical domain. Without modifying the grammar and the probabilistic model of the original HPSG parser, we develop a log-linear model with additional features on a treebank of the biomedical domain. Since the treebank of the target domain is limited, we need to exploit an original disambiguation model that was tra...

Using an HPSG grammar for the generation of prosodic structures

In this paper, we report on an experiment showing how the introduction of prosodic information from detailed syntactic structures into synthetic speech leads to better disambiguation of structurally ambiguous sentences. Using modifier attachment (MA) ambiguities and subject/object fronting (OF) in German as test cases, we show that prosody which is automatically generated from deep syntactic in...

Stochastic HPSG Parse Disambiguation Using the Redwoods Corpus

This article details our experiments on HPSG parse disambiguation, based on the Redwoods treebank. Using existing and novel stochastic models, we evaluate the usefulness of different information sources for disambiguation – lexical, syntactic, and semantic. We perform careful comparisons of generative and discriminative models using equivalent features and show the consistent advantage of discr...

Publication date: 2005